In the COGITO study (Schmiedek, Lovden, & Lindenberger, 2010), 101 younger adults practiced 12
tests of perceptual speed, working memory, and episodic memory for over 100 daily 1-hr sessions. The 
intervention resulted in positive transfer to broad cognitive abilities, including reasoning and episodic 
memory. Here, we examine whether these ability-based transfer effects are maintained over time. Two 
years after the end of the training, 80 participants returned for follow-up assessments of the compre- 
hensive battery of transfer tasks. We found reliable positive long-term transfer effects for reasoning and 
episodic memory, controlling for retest effects by including participants from the original control group. 
This shows, for the first time, that intensive cognitive training interventions can have long-term broad 
transfer at the level of cognitive abilities. 

Keywords: cognitive training, cognitive abilities, transfer effects, latent change score models, long-term 
effects 



Attempts to improve cognitive functioning with training inter- 
ventions have a long history in psychology. For many years, 
interventions used strategy instruction and practice on tasks from 
psychometric test batteries of cognitive abilities, and at most these 
interventions produced transfer effects (i.e., improvements on un- 
trained tasks) that must be considered narrow (Noack, Lovden, 
Schmiedek, & Lindenberger, 2009). More recently, however, cog- 
nitive training research has produced a number of findings that 
paint a more positive picture of the effectiveness of practice-induced
changes of cognitive functioning. The most promising
findings come from trainings that (a) build on self-guided practice, 
rather than instruction of strategies (cf. Hofland, Willis, & Baltes, 
1981); (b) focus on the core capacities of working memory (WM; 
e.g., Dahlin, Stigsdotter-Neely, Larsson, Backman, & Nyberg, 
2008; Jaeggi, Buschkuehl, Jonides, & Perrig, 2008; Klingberg et
al., 2005; see Morrison & Chein, 2011, for review) or executive 
functions like task switching (Karbach & Kray, 2009); and (c) use 
computerized setups that adapt task difficulties to a continuously 
challenging level. Keeping individualized task difficulty high
creates a continuous mismatch between cognitive demands and individual
functional supplies. Such mismatches, if present for a prolonged
period, could have the potential to improve cognitive
processing efficiency rather than merely exploiting the available 
behavioral flexibility with effective, but typically task-specific, 
strategies (Lovden, Backman, Lindenberger, Schaefer, & Schmiedek,
2010). Recently, however, failed replications of WM training
studies have also been reported (Chooi & Thompson, 2012; Redick 
et al., 2012), and critical reviews on WM training have appeared 
(Melby-Lervag & Hulme, 2013; Shipstead, Hicks, & Engle, 2012; 
Shipstead, Redick, & Engle, 2012). Thus, the jury on the effec- 
tiveness and efficiency of cognitive training is still out and await- 
ing further empirical evidence that allows evaluating its useful- 
ness. 

To be of practical relevance for everyday competencies, 
training-induced changes need to meet two criteria. First, changes 
need to be located at the level of broad cognitive abilities, that is, 
they have to reach beyond the acquisition of task-specific skills. 
Second, changes need to be enduring, that is, maintained for some 
time after the training intervention has ended (cf. Sternberg, 2008).
Ideally, training interventions enhance the long-term trajectory of 
cognitive development, foster success in educational and profes- 
sional settings, and extend the period in old age during which 
individuals are able to live independently (Hertzog, Kramer, Wil- 
son, & Lindenberger, 2008). 

Empirically, the first criterion can be evaluated by investigating 
the range of transfer effects. Effects observed on individual trans- 
fer tasks, however, provide only weak evidence for improvements 
in general cognitive abilities. If an ability (e.g., reasoning) had 
indeed improved, one would expect that performance on indicator 
tasks (e.g., Raven's Advanced Progressive Matrices; Raven &
Horn, 2009) of this ability should improve. However, because 
performance on observed tasks can be influenced by factors be- 
yond the underlying ability, like measurement error or task- 
specific skills, the practice of relying on individual indicators of a 
given ability can easily lead to false positive findings (e.g.,
improvements due to the acquisition of task-specific skills) as well as
false negative findings (e.g., due to a lack of power because improvements
in ability are blurred by task-specific variance and measurement
error) regarding the question of whether the underlying
ability has improved. 

Therefore, studies on transfer of training need to investigate 
whether transfer can be discerned at the level of cognitive abilities 
(Lovden et al., 2010; Noack et al., 2009; Schmiedek, Lovden, &
Lindenberger, 2010; Shipstead et al., 2012). This requires assess- 
ing transfer with broad selections of heterogeneous tasks that cover 
the range of the target ability in a comprehensive manner and test 
changes at the level of common factors of these tasks. Such 
common factors represent sources of variance that are shared 
across tasks and are therefore free from measurement error and 
task-specific influences. Demonstrating transfer at this level pro- 
vides a more solid basis for concluding that ability has improved 
than focusing on the task level. 

Using data from the COGITO study, in which 101 younger and 
103 older adults practiced a battery of 12 cognitive tasks over 100 
daily sessions, Schmiedek et al. (2010) showed that a cognitive
intervention can result in transfer at the ability level for reasoning 
(i.e., fluid intelligence) and episodic memory in healthy younger 
adults. In addition, transfer was observed on a factor of WM tasks 
in both age groups. The tasks comprising this factor were struc- 
turally similar to the trained ones but differed in task content. 
Transfer of training was not reliable for reasoning and episodic 
memory in the older adults, and for perceptual speed as well as for 
a factor of complex span tasks of WM in both age groups. 

Regarding the criterion of temporal preservation, there is evi- 
dence that improvements can be maintained up to several years, 
particularly for improvements on the trained tasks (e.g., Ball et al., 
2002) and for specific strategies and skills (e.g., Brehmer et al., 
2008; Klauer & Phye, 2008; Stigsdotter-Neely & Backman, 1993). 
For long-term transfer effects, empirical evidence is scarcer. There 
is some indication that transfer effects can be maintained up to 18 
months (e.g., Borella, Carretti, Riboldi, & De Beni, 2010; Dahlin, 
Nyberg, Backman, & Stigsdotter-Neely, 2008; Holmes, Gather- 
cole, & Dunning, 2009; Li et al., 2008). Regarding the question of 
transfer breadth, earlier studies are of limited value because they 
were either confined to near transfer or to single indicator tasks per 
target ability. 



It is completely unknown whether transfer at the level of latent 
ability factors induced by cognitive interventions can be main- 
tained over longer periods of time (e.g., years). The COGITO 
study provides an opportunity to address this question because 
participants of the training and control groups came back for 
follow-up assessments of the transfer tasks about 2 years after 
posttest. Sample sizes at follow-up were sufficiently large to 
investigate long-term transfer effects at the ability level using 
latent change score models (McArdle, 2009; McArdle & Prindle, 
2008). These models have the advantage that transfer effects can be tested
directly at the latent factor level, which no longer
contains task-specific sources of variance or measurement error 
(see Figure 1). We predicted that the pattern of positive transfer at
the factor level that we observed at posttest would be maintained at
follow-up (i.e., in changes from pretest to follow-up for the training
group minus corresponding changes for the control group). As no reliable
transfer effects for the abilities
of episodic memory and reasoning could be demonstrated for the 
older adults at posttest, we restricted our analyses to the younger 
adults. 

Method 

Participants and Procedure 

During the training phase, 101 younger adults (51.5% women,
M_age = 25.6 years, SD_age = 2.7, range: 20-31 years) completed an
average of 101 practice sessions (SD = 2.6, range: 87-109).
Participants in the no-contact control group were 44 younger
adults (47.7% women, M_age = 25.2 years, SD_age = 2.5, range:
21-29 years). Before and after the training, participants completed 
pre- and posttests during 10 sessions that consisted of 2-2.5 hr of 
comprehensive cognitive test batteries and self-report question- 
naires. On average, time elapsing between pre- and posttest was 
197 versus 193 days for the training and control groups, respec- 
tively. Additional information on sample characteristics and study 
dropout can be found in Schmiedek, Lovden, and Lindenberger 
(2010) and Schmiedek, Bauer, Lovden, Brose, and Lindenberger 
(2010). 

The cognitive assessment of the posttest sessions was repeated 
at the 2-year follow-up (time from posttest to follow-up: M =
755 days, Mdn = 749 days, range: 679-927 days, for the training
group; M = 745 days, Mdn = 742 days, range: 693-798 days,
for the control group). Participation rates at follow-up were satis- 
factory (80 younger adults in the training and 32 in the control 
group, corresponding to 79% and 73% of the original sample sizes, 
respectively). Comparisons of pretest performance on the transfer 
tasks and on the Digit-Symbol Substitution Test (Wechsler, 1981) 
showed that the follow-up sample did not differ significantly from 
the dropouts between posttest and follow-up (ps > .05), with the 
exception of numerical reasoning, for which the follow-up sample 
had significantly higher performance at pretest than did the dropouts, 
t(99) = 2.22, p = .028. The present analyses were confined to the 
follow-up sample. Within this sample, pretest differences on the transfer
tasks and the Digit-Symbol Substitution Test between the
trained and control groups were not significant (ps > .05). 

Figure 1. Latent change score model for modeling training-induced changes at the latent factor level. Squares
represent observed variables, circles represent latent factors, and the triangle serves to represent information
regarding means and intercepts. A1-A3, B1-B3, C1-C3 = observed indicator variables A, B, and C (i.e., tasks
of one ability) measured at the three time points; F1-F3 = latent factor of ability at the three time points; LC1 =
latent change factor from pretest to posttest; LC2 = latent change factor from pretest to follow-up; α = latent mean
of ability factor at pretest; β = mean change of latent ability factors from pre- to posttest; γ = mean change of latent
ability factors from pretest to follow-up; δ = variance (individual differences) in latent ability at pretest. Variances
of the latent change factors were fixed to zero because they were not significant. Loadings of observed variables
on latent factors, intercepts of observed variables, and residual variances were fixed to be the same across the
three time points and across training and control groups (i.e., strict measurement invariance). Residuals for the
same observed variable were allowed to correlate across time points.
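
As a reading aid, the model in Figure 1 can be written out in equations. The notation is an assumption for illustration, not the authors' own: ν_j, λ_j, and ε_jt stand for the intercept, loading, and residual of indicator j at occasion t (with no occasion or group index on ν_j and λ_j, reflecting measurement invariance), and α, β, γ, and δ denote the pretest latent mean, the mean change to posttest, the mean change to follow-up, and the pretest latent variance, as in the caption.

```latex
\begin{align}
  y_{jt} &= \nu_j + \lambda_j F_t + \varepsilon_{jt},
         \qquad j \in \{A, B, C\},\; t \in \{1, 2, 3\} \\
  F_2 &= F_1 + LC_1, \qquad F_3 = F_1 + LC_2 \\
  \mathrm{E}[F_1] &= \alpha, \quad \mathrm{E}[LC_1] = \beta, \quad
  \mathrm{E}[LC_2] = \gamma, \quad \mathrm{Var}(F_1) = \delta
\end{align}
```

In this notation, with β and γ estimated separately for each group, the net transfer effects correspond to β_training − β_control at posttest and γ_training − γ_control at follow-up.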



Tasks 

In each session, participants practiced 12 different computerized 
tasks with two to eight blocks each. For perceptual speed, those 
were three two-choice reaction tasks (odd vs. even numbers; 
consonants vs. vowels; symmetric vs. asymmetric figures) and 
three comparison tasks (two strings of digits/consonants, or two 
three-dimensional figures). For episodic memory, tasks required 
participants to memorize word lists, number-word pairs, or object 
positions in a grid. WM tasks were adapted versions of the alpha 
span, numerical memory updating, and spatial n-back tasks (for 
details of all tasks, see Schmiedek, Lovden, & Lindenberger, 
2010). Difficulty levels for the choice-reaction, episodic memory, 
and WM tasks were individualized using different presentation 
times based on pretest performance. 

Transfer tasks included computerized tasks as well as 27 tasks 
from the paper-and-pencil Berlin Intelligence Structure (BIS) test 
(Jäger, Süß, & Beauducel, 1997). The three near transfer WM tasks
were based on the same three paradigms as the practiced WM tasks, 
but used different content material. The far transfer WM tasks were 
established complex span tasks (reading span, counting span, and 
rotation span). For episodic memory, one computerized word 
paired-associates task and nine tasks from the BIS (three for each 
content domain) were used. Transfer in reasoning was measured 
with 15 items from the Raven's Advanced Progressive Matrices 
(Raven & Horn, 2009) as well as with nine tasks from the BIS, 
three for each content domain. 



Data Analysis 

Effect sizes (d) for single tasks were calculated as mean pre-post 
(pre-follow-up) differences in accuracy divided by the SD of the 
experimental group at pretest. Net effects provided in Table 1 were 
obtained by subtracting the effect sizes for the control from those 
of the training group. Whether these net effects were statistically 
significant was investigated by testing the interaction of occasion 
and group with linear mixed effect models (using PROC MIXED 
in SAS 9.3; Kenward-Roger degrees of freedom; see Littell, Mil- 
liken, Stroup, Wolfinger, & Schabenberger, 2006) that allowed for 
different variances at pre- and posttest (F tests for the interaction 
are provided in Table 1). Effects at the latent level were analyzed 
with latent change score models (McArdle, 2009; McArdle & 
Prindle, 2008). In these models, latent factors were defined by a set 
of transfer tasks. Improvements at the latent factor level were 
captured by the means of latent change score factors (see Figure 1). 
In order for these means to be readily interpretable, it is necessary 
that factor loadings and intercepts are constrained to be equal 
across occasions and experimental groups (strong measurement 
invariance). Here, we even aimed for strict measurement invari- 
ance (i.e., residual variances also fixed across occasions and ex- 
perimental groups). Tests of whether mean changes at the latent 
factor level were significant were conducted by comparing the 
-2LL of models in which means of the latent change factor were 
estimated separately for the training and control groups with 
models in which both means were constrained to be equal, resulting
in a test with one df. Tests of whether effects at follow-up
differed from those at posttest were conducted by comparing the
unconstrained model with one in which the training-minus-control
differences were constrained to be equal for both latent change
factors, again resulting in a test with one df.

Table 1

Transfer Effects for Follow-Up Sample and Individual Tasks at Posttest and Follow-Up

Task                        Pre-post net   Pre-Post ×                   Pre-follow-up net   Pre-Follow-up ×
                            effect size    Experimental Group           effect size         Experimental Group

Working memory: Near
  Animal span               .02            F(1, 110) = 0.01, ns         -.06                F(1, 110) = 0.11, ns
  N-back numerical          .41            F(1, 110) = 6.21, p = .014   .46                 F(1, 110) = 9.07, p = .003
  Memory updating spatial   .07            F(1, 124) = 0.18, ns         -.05                F(1, 124) = 0.06, ns
Working memory: Far
  Reading span              .00            F(1, 124) = 0.00, ns         .31                 F(1, 124) = 1.72, ns
  Counting span             .03            F(1, 124) = 0.03, ns         .24                 F(1, 124) = 1.24, ns
  Rotation span             .08            F(1, 124) = 0.28, ns         .04                 F(1, 124) = 0.08, ns
Reasoning
  Verbal                    .12            F(1, 110) = 1.38, ns         .22                 F(1, 110) = 4.14, p = .044
  Numerical                 .25            F(1, 110) = 5.40, p = .022   .32                 F(1, 110) = 7.11, p = .009
  Figural/spatial           .23            F(1, 110) = 3.68, ns         .27                 F(1, 110) = 7.30, p = .008
  Raven                     .21            F(1, 109) = 1.58, ns         .40                 F(1, 107) = 3.90, ns
Memory
  Verbal                    .49            F(1, 110) = 17.09, p < .0001 .15                 F(1, 110) = 1.68, ns
  Numerical                 .53            F(1, 110) = 11.15, p = .001  .16                 F(1, 110) = 1.20, ns
  Figural/spatial           .20            F(1, 110) = 3.42, ns         .21                 F(1, 110) = 3.43, ns
  Word pairs                .22            F(1, 110) = 2.20, ns         .16                 F(1, 110) = 0.92, ns

Note. Pre = pretreatment; post = posttreatment.
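
The single-task net effect sizes described under Data Analysis can be illustrated with a minimal sketch. It assumes that each group's mean change is divided by that group's own pretest SD, which is one reading of the description. The function names and all scores are hypothetical and are not data from the study.

```python
from statistics import mean, stdev


def effect_size(pre, later):
    """Standardized change: mean difference (later minus pre) over the pretest SD."""
    return (mean(later) - mean(pre)) / stdev(pre)


def net_effect(train_pre, train_later, control_pre, control_later):
    """Net effect: training-group effect size minus control-group effect size."""
    return effect_size(train_pre, train_later) - effect_size(control_pre, control_later)


# Hypothetical accuracy scores (proportion correct), not data from the study.
train_pre = [0.50, 0.60, 0.70, 0.80]
train_post = [0.60, 0.70, 0.80, 0.90]
control_pre = [0.50, 0.60, 0.70, 0.80]
control_post = [0.55, 0.65, 0.75, 0.85]

d_net = net_effect(train_pre, train_post, control_pre, control_post)
print(round(d_net, 2))
```

Subtracting the control group's standardized change removes the part of the improvement that is attributable to retest effects alone, which is why the net values rather than the raw training-group changes are reported.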

Model fits were acceptable for reasoning, χ²(75) = 83.91,
root-mean-square error of approximation (RMSEA) = .05, and
episodic memory, χ²(75) = 93.61, RMSEA = .07, but not for the
model of WM near transfer tasks, even if only strong measurement
invariance was modeled, χ²(60) = 106.73, RMSEA = .12. We
therefore refrain from interpreting results for WM at the latent 
factor level. 
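
To show how the fit indices relate to the test statistics, the sketch below recomputes RMSEA for the episodic memory and WM near-transfer models. The source does not say which RMSEA formula or software was used. The code assumes one common two-group form, the square root of the number of groups times sqrt(max(χ² − df, 0) / (df · (N − 1))), with N = 80 + 32 follow-up participants. The function name is hypothetical.

```python
import math


def multigroup_rmsea(chi2, df, n_total, n_groups):
    """RMSEA with the multi-group correction factor sqrt(n_groups) (assumed form)."""
    excess = max(chi2 - df, 0.0)  # misfit beyond the expected value of chi-square
    return math.sqrt(n_groups) * math.sqrt(excess / (df * (n_total - 1)))


N_FOLLOW_UP = 80 + 32  # training plus control participants at follow-up

rmsea_memory = multigroup_rmsea(93.61, 75, N_FOLLOW_UP, 2)
rmsea_wm_near = multigroup_rmsea(106.73, 60, N_FOLLOW_UP, 2)
print(round(rmsea_memory, 2), round(rmsea_wm_near, 2))
```

Under these assumptions the two values round to the reported .07 and .12, which suggests, without confirming, that a formula of this kind was used.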

Latent effect sizes were calculated by dividing the latent mean 
differences by the latent SDs at pretest. For analyses of the BIS 
test, tasks were parceled for each ability construct by calculating 
composites of standardized scores for the three tasks of each 
content domain. As these scores were thus already standardized 
based on pretest SDs, mean differences are in effect-size metric 
and do not need to be divided by SDs. 
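
The parceling and the latent effect size can be sketched in the same way. The code assumes that a composite is the person-wise mean of the task scores after each task is standardized on its pretest mean and SD. The source says only that composites of standardized scores were calculated. All names and numbers are hypothetical.

```python
import math
from statistics import mean, stdev


def standardize(scores, pretest_scores):
    """Express scores in pretest units: subtract the pretest mean, divide by the pretest SD."""
    m, s = mean(pretest_scores), stdev(pretest_scores)
    return [(x - m) / s for x in scores]


def parcel(tasks, tasks_pretest):
    """Average the standardized scores of several tasks, person by person (assumed composite)."""
    z = [standardize(t, pre) for t, pre in zip(tasks, tasks_pretest)]
    return [mean(person) for person in zip(*z)]


def latent_effect_size(latent_mean_difference, latent_pretest_variance):
    """Latent mean difference divided by the latent SD at pretest."""
    return latent_mean_difference / math.sqrt(latent_pretest_variance)


# Hypothetical scores of four people on the three tasks of one content domain.
pretest = [[10, 12, 14, 16], [1, 2, 3, 4], [20, 30, 40, 50]]
posttest = [[12, 14, 16, 18], [2, 3, 4, 5], [30, 40, 50, 60]]

parcel_pre = parcel(pretest, pretest)
parcel_post = parcel(posttest, pretest)
parcel_change = mean(parcel_post) - mean(parcel_pre)  # already in pretest-SD units

d_latent = latent_effect_size(0.5, 4.0)  # hypothetical latent mean difference and variance
```

Because the posttest scores are standardized on the pretest mean and SD, a mean change in the parcel can be read directly as an effect size, whereas a latent mean difference on an unstandardized scale still has to be divided by the latent pretest SD.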

Results 

In the following, we focus on long-term transfer effects at the 
latent factor level and restrict our analyses to those transfer effects 
for which we found significant results at posttest for the younger 
adults (Schmiedek, Lovden, & Lindenberger, 2010); that is, for 
latent factors of reasoning and episodic memory. Results on trans- 
fer effects at the observed task level are reported in Table 1.

For the latent factor of reasoning, there was a significant interaction
of experimental group and occasion, χ²(2) = 15.54, p <
.001. The latent net effect sizes were .17, χ²(1) = 7.41, p = .006,
at posttest and .23, χ²(1) = 14.57, p < .001, at follow-up. The
difference of these effects was not reliable, χ²(1) = 1.12, ns. As
shown in Figure 2, this was due to relative stability of latent means 
for both the trained and the control group. For the latent factor of 
episodic memory, there was a significant interaction of experimental
group and occasion, χ²(2) = 31.45, p < .001. The latent net
effect sizes were .47, χ²(1) = 30.48, p < .001, at posttest and .18,
χ²(1) = 3.88, p = .041, at follow-up. The difference of these
effects was reliable, χ²(1) = 11.54, p < .001. The reduction of the
effect was mainly due to a reduction of the effect in the trained 
group (see Figure 2). 
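
The chi-square difference tests used here compare the -2LL of a constrained and an unconstrained model. A minimal sketch of that comparison follows, using closed-form upper-tail probabilities that hold only for one and two df. The function names and the -2LL values in the last lines are hypothetical. The test statistics are those reported for the latent reasoning factor.

```python
import math


def chi2_p_value(stat, df):
    """Upper-tail p value of a chi-square statistic, for df = 1 or df = 2 only."""
    if df == 1:
        return math.erfc(math.sqrt(stat / 2.0))
    if df == 2:
        return math.exp(-stat / 2.0)
    raise ValueError("closed forms are implemented only for df = 1 and df = 2")


def lr_test(neg2ll_constrained, neg2ll_free, df):
    """Likelihood-ratio test: difference in -2LL between two nested models."""
    stat = neg2ll_constrained - neg2ll_free
    return stat, chi2_p_value(stat, df)


# Test statistics reported for the latent reasoning factor.
p_omnibus = chi2_p_value(15.54, 2)    # Group x Occasion interaction
p_posttest = chi2_p_value(7.41, 1)    # net effect at posttest
p_difference = chi2_p_value(1.12, 1)  # posttest versus follow-up

# Hypothetical -2LL values, used only to illustrate the model comparison.
stat, p = lr_test(neg2ll_constrained=2507.41, neg2ll_free=2500.00, df=1)
```

With one df, the statistic for the posttest effect corresponds to a p value of about .006, and the statistic for the difference between occasions falls well short of significance, in line with the values reported for reasoning.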

In sum, the results at the latent factor level show that the 
improvements at the ability level for reasoning and episodic mem- 
ory were (a) significant at posttest for the reduced follow-up 
sample, (b) significant at the 2-year follow-up, and (c) signifi- 
cantly reduced at follow-up, in comparison to transfer at posttest, 
for episodic memory, but not for reasoning. Group differences in 
motivation are unlikely to be the cause of these effects, as self- 
reported motivation to work on the tasks did not differ signifi- 
cantly between the training and control groups (see Figure 3). 

Discussion 

The present results show that far transfer to broad cognitive 
abilities can be maintained over several years. The sizes of the 
observed reliable effects were not large. However, their breadth 
renders them beneficial for a number of real-life outcomes. As 
reasoning and episodic memory are abilities of high predictive 
validity for everyday competency (Tucker-Drob, 2011), even
small effects can have a substantial impact on performance in 
educational, professional, and leisure activity settings. Training 
interventions that lead to small effects of wide scope and high 
temporal stability may pay off more than interventions that lead to 
strong but specific effects that do not last for long. 

Regarding reasoning, transfer effects at follow-up were significant
at the observed task level as well as at the latent ability level and
were of comparable size to those at posttest. For episodic memory,
transfer effects at follow-up were no longer significant at the observed
task level for verbal, numerical, and figural-spatial memory
(see Table 1), whereas the effect at the level of their common
factor was reduced in comparison to the posttest effect but
remained reliable. This further demonstrates the usefulness of
investigating transfer at the latent factor level. At the observed task
level, performance is measured with imperfect reliability due to
measurement error and might be influenced by task-specific strategies
that have been acquired during the training, but could not be
reactivated in an effective manner after 2 years. As the latent level
only captures sources of variance that have a general influence on
all indicator tasks of the factor, general effects, if present, are more
easily detectable there.

Figure 2. Latent means and associated standard errors for the training and control groups at pretest, posttest,
and follow-up. Training group shown with solid lines, control with dashed lines. A: latent factor of reasoning;
B: latent factor of episodic memory. As the indicator tasks of the latent factors were standardized by SDs at
pretest, latent means are in effect size metric.

How did transfer to broad cognitive abilities come about, and 
how was it maintained over the considerable period of 2 years? We 
hold that plasticity at the neural level requires a sustained chal- 
lenge of the cognitive system produced by a mismatch between 
cognitive demands and functional supplies (Lovden et al., 2010).
The breadth (12 heterogeneous tasks that differed in content and
paradigms), intensity (high difficulty due to adjustment to individual
performance levels), and dosage (100 sessions of about 1 hr
duration) of the training fulfill this requirement and could thereby
lead to plastic brain changes, for example, in gray matter (Dra- 
ganski et al., 2006), white matter (Scholz, Klein, Behrens, & 
Johansen-Berg, 2009), and neurotransmitter systems (Backman et 
al., 2011; McNab et al., 2009). For a subsample of COGITO 
participants, Lovden, Bodammer, et al. (2010) have found indica- 
tions of improved white-matter microstructure as well as increased 
volumes of the anterior corpus callosum at posttest. Little is known 
about the temporal stability of plastic neural changes, and we do 
not know whether and how they help to preserve positive transfer 
in broad cognitive abilities. 

Figure 3. Self-reported motivation to work on the tasks at pretest, posttest,
and follow-up for the training and control groups. Participants
answered the question "I tried to do well on the tasks" on an 8-point scale
(0 = does not apply at all, 7 = does apply very well) at the end of the
session in which they had worked on the Berlin Intelligence Structure test.
This information was available on all three occasions for 71 participants
from the training and 31 of the control group participants. Solid and broken
lines show means for the trained and control group, respectively. Error bars
denote standard errors. While the main effect of occasion was significant,
F(2, 202) = 4.69, p = .010, neither the main effect of group, F(1, 201) =
2.88, ns, nor the interaction of group and occasion, F(2, 202) = 0.03, ns,
was reliable.

In addition to plastic changes at the neural level, we also need 
to consider rather complex reciprocal effects among the develop- 
mental trajectories of cognitive and other psychological variables. 
Improved cognitive abilities may open opportunities in the educa- 
tional and professional paths of younger adults that in turn lead to 
continuously raised levels of cognitive demand, which may help to 
perpetuate the beneficial effects of the training. Similarly, in- 
creased cognitive capacities might lead to an increased need for 
cognition (Cacioppo, Petty, Feinstein, & Jarvis, 1996) or openness 
to experience (Jackson, Hill, Payne, Roberts, & Stine-Morrow, 
2012) that makes participants seek and face cognitive challenges in 
their lives. Findings of long-term benefits of early education pro- 
grams that sometimes last decades after the intervention programs 
have ended (Barnett, 2011) underscore the importance of taking a
developmental perspective on cascading outcomes of training in- 
terventions. 

The finding that latent transfer effects were reduced at follow-up 
for episodic memory, but not reasoning, speaks to the possibility 
that the acquisition of general strategies might also have contrib- 
uted to the findings for episodic memory at posttest. Besides the 
influence of task-specific strategies, which should not influence 
findings at the latent factor level, our participants might also have 
acquired and practiced more general strategies, like mental imag- 
ery, that are supportive for a broad selection of episodic memory 
tasks. Difficulties with an ad-hoc reactivation of these strategies at 
follow-up might explain the reduction of transfer effects. As no 
reasoning tasks were included in the training and as potential 
strategies used with the practiced WM tasks are much less likely to 
be of help for performance on the transfer reasoning tasks, a 
strategy-based explanation of the transfer to reasoning is difficult 
to entertain. 



In sum, the present findings provide room for cautious optimism 
(cf. Hertzog et al., 2008). Cognitive trainings can produce transfer 
effects that are sufficiently large in scope and stable over time to 
justify the considerable effort that is needed to produce them. 
Future studies should hold up the proposed standard of investigat- 
ing transfer at the level of latent ability factors and improve on the 
investigation of the mechanisms that produce transfer and main- 
tenance. Future research will need to take close and continuous 
looks at postintervention developmental trajectories on behavioral, 
social, and neural dimensions to better understand the conditions 
under which cognitive training interventions can trigger a cascade 
of changes that result in improved or maintained cognitive com- 
petence. 